
Conversation

@d-v-b (Contributor) commented Oct 30, 2025

Since #3554 was an unpopular direction, I'm going instead with CodSpeed + pytest-benchmark. Opening as a draft because I haven't looked into how CodSpeed works at all, but I'd like people to weigh in on whether these initial benchmarks make sense. Naturally we can add more specific ones later, but I figured some bulk array read / write workloads would be a good start.
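
For reference, a minimal sketch of what a bulk read / write benchmark with pytest-benchmark could look like (illustrative only: the array shape, chunk size, in-memory store, and the zarr-python 3 `create_array` call are assumptions, not necessarily what this PR adds):

```python
# Hypothetical bulk read / write benchmarks using pytest-benchmark.
# Shapes, chunk sizes, and the in-memory store are illustrative assumptions.
import numpy as np
import pytest
import zarr
from zarr.storage import MemoryStore

SHAPE = (1024, 1024)
CHUNKS = (256, 256)


@pytest.fixture
def data() -> np.ndarray:
    return np.random.default_rng(0).integers(0, 255, size=SHAPE, dtype="uint8")


def test_write_full(benchmark, data: np.ndarray) -> None:
    arr = zarr.create_array(store=MemoryStore(), shape=SHAPE, chunks=CHUNKS, dtype=data.dtype)
    # time a bulk write of the whole array
    benchmark(arr.__setitem__, Ellipsis, data)


def test_read_full(benchmark, data: np.ndarray) -> None:
    arr = zarr.create_array(store=MemoryStore(), shape=SHAPE, chunks=CHUNKS, dtype=data.dtype)
    arr[...] = data
    # time a bulk read of the whole array
    benchmark(arr.__getitem__, Ellipsis)
```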

@github-actions github-actions bot added the needs release notes Automatically applied to PRs which haven't added release notes label Oct 30, 2025
@github-actions github-actions bot removed the needs release notes Automatically applied to PRs which haven't added release notes label Oct 30, 2025
@d-v-b d-v-b marked this pull request as ready for review October 30, 2025 13:46
@d-v-b (Contributor, Author) commented Oct 30, 2025

@zarr-developers/steering-council I don't have permission to register this repo with CodSpeed. I submitted a request to register it; could someone approve it?

@d-v-b d-v-b requested review from dcherian and jhamman and removed request for jhamman October 30, 2025 14:37
@normanrz (Member) replied:

> @zarr-developers/steering-council I don't have permission to register this repo with CodSpeed. I submitted a request to register it; could someone approve it?

done

@d-v-b (Contributor, Author) commented Oct 30, 2025

Does anyone have opinions about benchmarks? Feel free to suggest something concrete. Otherwise, I think we should take this as-is and deal with later benchmarks (like partial shard reads / writes) in a subsequent PR.

@codspeed-hq bot commented Oct 30, 2025

CodSpeed Performance Report

Congrats! CodSpeed is installed 🎉

🆕 30 new benchmarks were detected.

You will start to see performance impacts in the reports once the benchmarks are run from your default branch.

Detected benchmarks


ℹ️ Only the first 20 benchmarks are displayed. Go to the app to view all benchmarks.

Review thread on the benchmark CI workflow:

    uses: CodSpeedHQ/action@v4
    with:
      mode: instrumentation
      run: hatch run test.py3.11-1.26-minimal:run-benchmark

@dcherian (Contributor) commented:

Can we test the latest instead? Seems more appropriate...

@d-v-b (Contributor, Author) replied:

The latest version of Python? What's the reasoning? I'd rather update this file when we drop a supported version vs. when a new version of Python comes out.

@dcherian (Contributor) commented Oct 31, 2025

Because we'd want to catch a perf regression from upstream changes too? I'm suggesting the latest versions of released libraries: py=3.13, np=2.2.

@d-v-b (Contributor, Author) replied:

We don't have an upper bound on numpy versions, so I don't think this particular workflow will help us catch regressions from upstream changes -- we would need to update this workflow every time a new version of numpy is released. IMO that's something we should do in a separate benchmark workflow. This workflow here will run on every PR, and in that case the oldest version of numpy we support seems better.

We also don't have to use a pre-baked hatch environment here; we could define a dependency set specific to benchmarking. But my feeling is that benchmarking against older versions of stuff gives us a better measure of what users will actually experience.

@dcherian (Contributor) commented:

> feel free to suggest something concrete

indexing please. that'll exercise the codec pipeline too.

a peakmem metric would be good to track also, if possible.
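
For concreteness, a hypothetical sketch of the kind of indexing benchmark being suggested (not part of this PR; the selections, array shape, chunking, and store are made up), where each selection only touches a subset of chunks and therefore exercises the codec pipeline:

```python
# Hypothetical indexing benchmarks (not in this PR): each selection decodes
# only the chunks it touches, so partial reads exercise the codec pipeline.
import numpy as np
import pytest
import zarr
from zarr.storage import MemoryStore

SELECTIONS = {
    "row": np.s_[0, :],
    "column": np.s_[:, 0],
    "block": np.s_[100:200, 100:200],
}


@pytest.fixture(scope="module")
def array() -> zarr.Array:
    arr = zarr.create_array(
        store=MemoryStore(), shape=(2048, 2048), chunks=(256, 256), dtype="uint8"
    )
    arr[...] = np.ones((2048, 2048), dtype="uint8")
    return arr


@pytest.mark.parametrize("selection", list(SELECTIONS.values()), ids=list(SELECTIONS))
def test_read_selection(benchmark, array: zarr.Array, selection) -> None:
    # time a partial read through the given selection
    benchmark(array.__getitem__, selection)
```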

@d-v-b (Contributor, Author) commented Oct 31, 2025

> feel free to suggest something concrete
>
> indexing please. that'll exercise the codec pipeline too.
>
> a peakmem metric would be good to track also, if possible.

I don't think CodSpeed or pytest-benchmark do memory profiling; we would need https://pytest-memray.readthedocs.io/en/latest/ or something equivalent for that.

And an indexing benchmark sounds like a great idea, but I don't think I have the bandwidth for it in this PR right now.
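
If peak memory does get tracked later, a rough sketch of what a pytest-memray check could look like (purely illustrative; the memory budget, array size, and test name are made up, and this is not part of this PR):

```python
# Hypothetical peak-memory check with pytest-memray (not part of this PR).
# The memory budget and array size below are made up.
import numpy as np
import pytest
import zarr
from zarr.storage import MemoryStore


@pytest.mark.limit_memory("300 MB")
def test_bulk_read_peak_memory() -> None:
    arr = zarr.create_array(
        store=MemoryStore(), shape=(4096, 4096), chunks=(512, 512), dtype="uint8"
    )
    arr[...] = np.ones((4096, 4096), dtype="uint8")
    # memray records the peak allocations made during this read
    _ = arr[...]
```

Run with `pytest --memray` so the marker is enforced.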

@d-v-b (Contributor, Author) commented Nov 3, 2025

I added a benchmark that clearly reveals the performance improvement of #3561.

@d-v-b (Contributor, Author) commented Nov 3, 2025

I added some slice-based benchmarks based on the examples from #3524, and I updated the contributing docs with a section about the benchmarks. Assuming we can resolve the discussion about which Python / NumPy versions to use in the CI job, I think this is ready.

@d-v-b (Contributor, Author) commented Nov 3, 2025

New problem: the CodSpeed CI benchmarks are way too slow! The benchmark suite runs in 90s locally, but it's taking over 40m to run in CI. Help would be appreciated in speeding this up.

@d-v-b (Contributor, Author) commented Nov 3, 2025

Owing to the large number of syscalls in our benchmark code, CodSpeed recommended using the walltime instrument instead of their virtual CPU instrument. But to turn on walltime benchmarking, we would need to run our benchmarking code on CodSpeed's servers, which is a security risk.

Given that CodSpeed is not turning out to be particularly simple, I am inclined to defer the CodSpeed CI stuff for later work. But if someone can help get the test runtime down, and / or we are OK running our benchmarks on CodSpeed's servers, then maybe we can get that sorted in this PR.
